The effect on the equilibrium sickle cell allele frequency of the probable protection conferred by malaria and sickle cell gene against other infectious diseases

If a mutated gene with heterozygous advantage against malaria, e.g., the hemoglobin S (HbS) gene, is introduced in a small tribe, the gene (allele) frequency (fgene) increases until it reaches a steady state value (feq) where the total mortality from malaria and sickle cell disease is a minimum. This is a classic example of balanced polymorphism, known as the malaria hypothesis. A previous in silico study, assuming realistic initial conditions, showed that the feq is around 14%, far less than the fgene of 24% observed in certain parts of Africa. It seems that the malaria hypothesis, per se, cannot explain such a high fgene unless it is assumed that malaria and the HbS gene provide protection against other diseases. Using Monte-Carlo simulation, the current study examined the effect of five scenarios on feq. The scenarios consisted of different combinations of mortality from other diseases and the possible amounts of protection conferred by malaria and the HbS gene against these diseases. Taking into account other diseases causing mortality in the population makes the fgene rate of change steeper over generations. feq is an increasing function of the amount of protection conferred by the HbS gene against other diseases. The effect on feq of the protection provided by malaria against other diseases is, however, variable: depending on the amount of protection conferred by the HbS gene against other diseases, it may increase or decrease feq. If malaria and the HbS gene provide protections of 1.5-fold and threefold against other diseases, respectively, the feq is around 24%, the value reported in certain tribes of Africa. Under certain scenarios, the feq attained is even higher.

Sickle-cell anemia is an example of autosomal recessive monogenic hereditary hemoglobinopathies. It is caused by a point mutation at the 6th position of the amino acid sequence of β-globin, where glutamic acid is substituted by valine 1. The disease has an uneven distribution across the globe, being highly prevalent in some places and extremely rare in other regions 2. In certain parts of Africa, 40% of the population have sickle cell trait (heterozygous form of the disease) 3,4; 4% have sickle cell disease (homozygous form of the disease). This gives a gene (allele) frequency (f gene) of 24% 5-7. With no treatment, most children with sickle cell disease cannot survive to the reproductive age. Nevertheless, the f gene has remained high over several decades 6,8.
In 1946, Beet reported lower malarial infection rates among carriers of the sickle cell trait compared with non-sicklers 9. Eight years later, Allison proposed that carriers of the sickle cell gene are resistant to fatal falciparum malaria 3,7,10. By the end of the 1960s, it was generally accepted that the high hemoglobin S (HbS) f gene in certain parts of the world (e.g., Africa) is attributable to the advantage conferred by the HbS gene against malaria. This relationship became a classic example of "balanced polymorphism" in man, which is known as the "malaria hypothesis"; f gene for the advantageous heterozygous state increases until its incidence is balanced by the loss of homozygotes due to sickle cell disease complications 2.
In a recent article, I have shown that to effectively provide protection against malaria within a short period, the HbS gene mutation needs to occur in a small tribe with about 50 people at the reproductive age; the process would take more than 2000 years in a large population 11. The f gene increases until it reaches a steady state value (f eq). f gene will, however, not remain constant thereafter; it fluctuates around f eq, a phenomenon termed "genetic drift" 12. Under a realistic scenario, f eq is around 14% 11. f eq rarely exceeds 15%, even in populations under intense malaria selection 13. Even if genetic drift is taken into account, the probability that the f gene reaches the observed value of 24% or more, reported in certain tribes of Africa 5-7, is very low (~0.005) 11. f eq could, however, reach the observed value of 24% if the HbS gene and malaria confer protection against or reduce the mortality from other diseases prevalent in the region 11,14-18. Using Monte-Carlo simulation, this in silico study was conducted to determine the effect of five scenarios on f eq. The scenarios consisted of different combinations of mortality rates of other diseases and the possible amounts of protection provided by malaria and the HbS gene against these diseases.

Monte-Carlo simulation
The methodology used in the current study was basically similar to that employed in the previous research 11. In the Monte-Carlo simulation, the model parameters are considered stochastic or random variables; the technique involves running the model several times, each time using a set of input values randomly drawn from a set of possible values to determine various possible outcomes 19. To have a valid realistic simulation, we need to identify the important variables and estimate their effects on the process.

Basic parameters
In a recent study, I examined the conditions under which the malaria hypothesis can best work 11. Herein, five scenarios were investigated. All the scenarios were modifications of scenario 6 of the previous work 11, in which a tribe of 150 hunter-gatherers with 25 couples at reproductive age (an effective population size of 50) decided to adopt agricultural life and settled near water, where malaria and its associated conditions killed about 15% of their children before the reproductive age 11. Such a transition from the hunter-gatherer to the farmer lifestyle happened around 4000-5000 years ago in Western and Central Africa 20,21.

Number of children
The average number of children of hunter-gatherers was around five for each couple 22. With such spacing, parents could carry the youngest child, while the older children could walk and follow the tribe. Children were breastfed for a longer period, which decreased the likelihood of another pregnancy 23. More than 50% of the children died early, before the reproductive age. In this way, the average number of five children for each couple kept the hunter-gatherer population size almost stationary 22. In this simulation, a variable number of children for each couple was assumed so that 10%, 15%, 50%, 15%, and 10% of the hunter-gatherer couples gave birth to 2, 3, 4, 5, and 6 children, respectively. This gives an average of four children for each hunter-gatherer couple. With the abundance of food after the hunter-gatherers settled and became farmers, they gave birth to more children, presumably an average of five children for each couple. Therefore, in the current simulation, it was assumed that from the 5th generation onward, when the transition from the hunter-gatherer to the farmer population was complete, 10%, 15%, 50%, 15%, and 10% of the farmer couples gave birth to 3, 4, 5, 6, and 7 children, respectively 22.
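The two offspring-number distributions can be written as lookup tables and sampled directly. The following is a minimal sketch, not the program used in the study; the names `HUNTER_GATHERER`, `FARMER`, `expected_children`, and `draw_children` are hypothetical and introduced only for illustration.

```python
import random

# Lookup tables from the text: (number of children, probability).
HUNTER_GATHERER = [(2, 0.10), (3, 0.15), (4, 0.50), (5, 0.15), (6, 0.10)]
FARMER = [(3, 0.10), (4, 0.15), (5, 0.50), (6, 0.15), (7, 0.10)]
TRANSITION_GENERATION = 5  # farmer table applies from the 5th generation onward


def expected_children(table):
    """Mean number of children implied by a lookup table."""
    return sum(n * p for n, p in table)


def draw_children(generation, rng=random):
    """Draw the number of children for one couple in the given generation."""
    table = FARMER if generation >= TRANSITION_GENERATION else HUNTER_GATHERER
    sizes = [n for n, _ in table]
    weights = [p for _, p in table]
    return rng.choices(sizes, weights=weights, k=1)[0]
```

Calling `expected_children` on each table gives the averages of four and five children stated in the text.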

Population size
While the population size of the hunter-gatherers was almost stationary 22,24,25, with abundance of food after a few generations, the number of children increased and the population of farmers grew. However, the population size did not grow indefinitely because of the limited resources available. It was assumed that the population grew until it reached a maximum of 6000 (an effective population size of 1000 couples) 11. The growth was estimated by the following logistic function after the start of the growth (the 5th generation):

$$N_t = \frac{1000\,N_0\,e^{0.15t}}{1000 + N_0\left(e^{0.15t} - 1\right)} \qquad (1)$$

where N 0 and N t represent the number of couples in reproductive age at the first five generations and t generations after the start of the growth, respectively. It was also assumed that there was cross-generational mating so that 5% of the parent population mated with offspring populations.
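The logistic growth of Eq. (1), N t = 1000 N 0 e^(0.15t) / [1000 + N 0 (e^(0.15t) − 1)], can be evaluated with a few lines of code. This is a minimal sketch; the function name `couples_at` and its keyword arguments are hypothetical, and the default N 0 of 25 couples is taken from the basic parameters.

```python
import math


def couples_at(t, n0=25, capacity=1000, rate=0.15):
    """Number of couples t generations after growth starts (Eq. 1).

    n0 is the number of couples before growth, capacity the maximum
    number of couples, and rate the per-generation growth constant.
    """
    growth = math.exp(rate * t)
    return capacity * n0 * growth / (capacity + n0 * (growth - 1.0))
```

At t = 0 the function returns N 0, and for large t it approaches the ceiling of 1000 couples without exceeding it.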

Other parameters
It was assumed that an advantageous mutated gene (e.g., HbS) occurred in one of the 50 people at the reproductive age, giving a starting f gene of 1% (one mutated allele out of 100). While 85% of those homozygous for the gene (SS genotype) died of the disease complications before the reproductive age 26,27, it was assumed that, compared to normal people (AA genotype), gene carriers (AS and SS genotypes) had a tenfold protection against fatal malaria 10,11,20,28.
That was a brief description of scenario 6 in the previous study 11. However, the scenarios studied herein were slightly different; it was assumed that malaria had a constant prevalence (pr m) of 40%, that children died of other diseases before the reproductive age with a cumulative probability (M O) of either 0% (scenario 1) or 25% (scenarios 2-5), and that malaria and the HbS gene conferred protection against other diseases by a factor of P m,O and P S,O, respectively, the amounts of which varied from scenario to scenario (Table 1). The cumulative mortality from malaria and its associated conditions before the reproductive age was kept at 15% for all scenarios (like scenario 6 of the previous study) so that the results were comparable 11. In the current simulation, the protections provided by AS and SS genotypes against other diseases were considered equal.
The protection factor conferred against malaria by the AS and SS genotypes (P S,m) was assumed to be 10 for both, like scenario 6 of the previous study 11. The probability of death for each genotype is then computed from the following quantities: M x, the probability of death in a person with genotype x; pr m, the prevalence of malaria; M m, the probability of death from malaria (in this simulation, pr m × M m was assumed to be 15%, which kept the cumulative probability of death from malaria and its associated disorders before the reproductive age constant and made the results of different scenarios comparable); M O and M S, the cumulative probabilities of death from other diseases and sickle cell disease (SS genotype) before the reproductive age, respectively; P S,m, the protection conferred by AS and SS genotypes against malaria, assumed to be 10 for both; and P S,O and P m,O, the protections conferred by the HbS gene and malaria against other diseases, respectively. The fitness (W) values for the AS and SS genotypes relative to the AA genotype are then obtained from these probabilities of death 11.
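One way to make these quantities concrete is sketched below. It is a reconstruction under stated assumptions, not the author's published equations: causes of death are treated as independent, protection factors divide the corresponding probability of death, malaria protects only those who have it, and SS individuals who escape sickle cell disease face the same risks as AS individuals. The equilibrium frequency uses the standard heterozygote-advantage result f eq = (W AS − 1)/(2W AS − W SS − 1). The function names are hypothetical; M m = 0.375 follows from pr m × M m = 15% with pr m = 40%.

```python
def survival(carrier, pr_m=0.40, M_m=0.375, M_O=0.25,
             P_Sm=10.0, P_SO=1.0, P_mO=1.0):
    """Probability of escaping death from malaria and other diseases.

    Assumption: independent causes of death; protection factors divide
    the relevant probability of death.
    """
    p_malaria = M_m / P_Sm if carrier else M_m
    p_other = M_O / P_SO if carrier else M_O
    with_malaria = (1 - p_malaria) * (1 - p_other / P_mO)
    without_malaria = 1 - p_other
    return pr_m * with_malaria + (1 - pr_m) * without_malaria


def equilibrium_frequency(M_S=0.85, **params):
    """Equilibrium allele frequency under heterozygote advantage."""
    s_AA = survival(False, **params)
    s_AS = survival(True, **params)
    s_SS = (1 - M_S) * s_AS          # assumption: SS = AS risks plus M_S
    W_AS = s_AS / s_AA               # fitness relative to AA
    W_SS = s_SS / s_AA
    return (W_AS - 1) / (2 * W_AS - W_SS - 1)
```

Under these assumptions, `equilibrium_frequency(M_O=0.0)` is about 0.139 and `equilibrium_frequency(P_SO=3.0)` about 0.257, in line with the f eq values reported for scenarios 1 and 4.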

Algorithm
The algorithm used in the study was basically similar to that employed in the previous article with minor modifications 11. The pseudo-code of the simulation program used is shown in Table 2. In step 1, a 50-element array corresponding to a 50-person population (25 men and 25 women in reproductive age) was defined. The elements of the array reflected the genotype of the corresponding person (0 = AA, 1 = AS, 2 = SS). For the zeroth generation (the very first parent population), only one of the 50 persons was assumed to have the AS genotype.
In step 2, f gene and the frequencies of homozygous (f SS) and heterozygous (f AS) individuals in the population at the zygotic level were calculated. Then, 40% (pr m, the prevalence of malaria assumed in this study) of people randomly selected from the population were supposed to have malaria.
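Steps 1 and 2 can be illustrated with a short sketch. It assumes the 0/1/2 genotype coding described above; the function names are hypothetical, and the published program (Table 2) may organize these steps differently.

```python
import random

AA, AS, SS = 0, 1, 2  # genotype codes used in the population array


def initial_population(size=50):
    """Step 1: everyone is AA except a single AS carrier."""
    population = [AA] * size
    population[0] = AS
    return population


def frequencies(population):
    """Step 2: allele frequency and genotype frequencies."""
    n = len(population)
    n_AS = population.count(AS)
    n_SS = population.count(SS)
    f_gene = (n_AS + 2 * n_SS) / (2 * n)  # each person carries two alleles
    return f_gene, n_AS / n, n_SS / n


def assign_malaria(population, pr_m=0.40, rng=random):
    """Step 2: mark a random pr_m fraction of people as having malaria."""
    n = len(population)
    infected = set(rng.sample(range(n), round(pr_m * n)))
    return [i in infected for i in range(n)]
```

For the zeroth generation, `frequencies(initial_population())` gives an allele frequency of 1%, the starting f gene stated in the text.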
In step 3, based on the genotype, presence or absence of malaria in each person in the population, and the amounts of protections conferred by HbS gene against malaria and the protections provided by malaria and HbS against other diseases, the probability of death before the reproductive age from either sickle cell disease, malaria, or other diseases was computed for each of the population members.
In step 4 of the simulation, 5% of the parent population was replaced by the members selected at random from the grandparent population (cross-generational mating).
In step 5, the fraction of the parent population who died was calculated. All the calculated frequencies (steps 2 and 5) were then recorded for further analysis (step 6). In step 7, it was checked whether any gene carriers (AS or SS genotypes) still remained in the parent population after steps 3 and 4. If no HbS gene carriers remained or if the number of people in the parent population was less than two, it was concluded that the HbS gene was eliminated, and the generation at which it happened was recorded; otherwise, the program proceeded to the next step.
In step 8, to give each survivor in the parent population an equal chance of marrying another, the population array elements were shuffled using a pseudorandom generator algorithm 29.
Table 1. The initial values for the simulation in various scenarios studied. For all scenarios, it was assumed that the probability of death from malaria before the reproductive age was 15%, that the protection against malaria conferred by AS and SS genotypes was 10, and that the probability of death from sickle cell disease (SS genotype) before the reproductive age was 85%.
In step 9, those of the parent population corresponding to even positions of the array (0, 2, 4, etc.) mated with those corresponding to their next element (positions 1, 3, 5, etc.) in the array to produce children according to their genotypes and Mendelian inheritance; the genotype of each child was then determined. The number of children for each couple was determined at random from a lookup table giving the number of children and its probability for each couple, depending on whether they were hunter-gatherers or farmers.
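The shuffling and mating of steps 8 and 9 can be sketched as follows. This is an illustration under assumptions, not the published program: genotypes use the 0/1/2 coding, each parent transmits the S allele with probability genotype/2, and `draw_children` is a hypothetical callable that returns the number of children for one couple.

```python
import random

AA, AS, SS = 0, 1, 2  # genotype code equals the number of S alleles


def child_genotype(mother, father, rng=random):
    """Mendelian inheritance: each parent passes one allele at random."""
    from_mother = rng.random() < mother / 2  # P(S) = 0, 0.5, or 1
    from_father = rng.random() < father / 2
    return int(from_mother) + int(from_father)


def mate(survivors, draw_children, rng=random):
    """Shuffle survivors, pair positions (0,1), (2,3), ..., and breed."""
    shuffled = survivors[:]
    rng.shuffle(shuffled)
    offspring = []
    # An unpaired last person (odd population size) has no children.
    for i in range(0, len(shuffled) - 1, 2):
        for _ in range(draw_children()):
            offspring.append(child_genotype(shuffled[i], shuffled[i + 1], rng))
    return offspring
```

Passing a fixed callable such as `lambda: 4` makes the sketch deterministic in family size, which is convenient for checking the pairing logic.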
In step 10, the parent population size of the next generation was computed. In step 11, the members of the new parent population were selected at random from the offspring population. The whole process was repeated from step 2 for 100 generations (~2500 years).
To smooth out the fluctuations caused by the inherent randomness of the Monte-Carlo method 19, the arithmetic mean and the standard deviation (SD) of the values obtained from 10,000 consecutive repeats of the program were calculated. For each scenario, the mean was taken as the final refined result in each generation, and the mean ± 1.96 × SD as the 95% confidence interval around the mean in each generation.
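The per-generation summary can be computed as in the sketch below. It assumes each repeat yields one f gene value per generation and uses the sample SD; whether the original program used the sample or the population SD is not stated. The function name `summarize` is hypothetical.

```python
import statistics


def summarize(runs):
    """Per-generation mean and mean ± 1.96 × SD across repeated runs.

    runs: list of trajectories, each a list with one value per generation.
    Returns a list of (mean, lower, upper) tuples, one per generation.
    """
    summary = []
    for values in zip(*runs):  # values of one generation across all runs
        mean = statistics.fmean(values)
        sd = statistics.stdev(values)  # sample SD (assumption)
        summary.append((mean, mean - 1.96 * sd, mean + 1.96 * sd))
    return summary
```

With 10,000 repeats, the difference between the sample and population SD is negligible, so the choice has little effect on the reported interval.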

Scenarios
Five scenarios were studied (Table 1). Assuming that other diseases had no mortality, scenario 1 was equivalent to scenario 6 described in the previous article 11. Scenario 2 was similar to scenario 1, except that each person in each generation ran a 25% risk (M O) of death from other diseases (in addition to malaria and sickle cell disease), but neither malaria nor the HbS gene conferred protection against other diseases. Scenario 3 was similar to scenario 2, except that only malaria conferred 1.5-fold protection against other diseases (P m,O). Scenario 4 was similar to scenario 2, except that only the HbS gene conferred threefold protection against other diseases (P S,O). Scenario 5 was similar to scenario 2, except that malaria and the HbS gene conferred 1.5-fold and threefold protection against other diseases, respectively (Table 1). In all scenarios, it was assumed that, compared with those with the AA genotype, gene carriers (AS and SS genotypes) had a tenfold protection against malaria (P S,m). It was also assumed that the cumulative probability of death from malaria and its associated disorders before the reproductive age was 15%.

Ethics
This in silico study did not involve any humans or animals or their tissue samples. Therefore, no institutional review board approval was necessary.

Results
Under all scenarios studied (Table 1), f gene increased over generations and reached a plateau, the f eq. Scenarios merely differed by f eq and the rate of change of f gene (Fig. 1). The f eq values obtained by the simulation (Fig. 1, horizontal dashed gray lines) were very similar to those computed by Eq. (4) (Table 1).

Scenario 1
Given that other diseases did not cause any mortality, scenario 1 was very similar to scenario 6 of the previous study (Fig. 1, green curve) 11; unsurprisingly, the f eq was 13.9% (Fig. 1), very similar to what was reported in the previous study 11.

Scenario 2
Here, the cumulative probability of death from other diseases before the reproductive age (M O) was 25%. Although people died of other diseases, neither malaria nor the HbS gene conferred protection against other diseases. The course of the curve (Fig. 1, yellow curve) was different from that in scenario 1 (Fig. 1, green curve), but the f eq did not differ from that in scenario 1 (Table 1).

Scenario 3
This scenario was similar to scenario 2, except that malaria conferred 1.5-fold protection against other diseases, but no protection was provided by the HbS gene against other diseases; the f eq was slightly higher than that observed for scenarios 1 and 2 (Table 1).

Scenario 4
This scenario was similar to scenario 2, except that the HbS gene conferred threefold protection against other diseases, but no protection was provided by malaria against other diseases; the f eq was almost 26% (Fig. 1, magenta curve).

Scenario 5
The scenario was a combination of scenarios 3 and 4: malaria provided 1.5-fold and the HbS gene conferred threefold protection against other diseases. The f eq was almost 25%, lower than that in scenario 4 (Fig. 1, blue curve).

The equilibrium gene (allele) frequency
f eq depends on both P m,O and P S,O (Fig. 2). However, when ∂f eq / ∂P m,O vanishes, f eq is independent of P m,O, which is when:

$$P^{*}_{S,O} = \frac{M_O M_m \left(P_{S,m} - 1\right)\left(pr_m - 1\right) + M_m^{2}\, pr_m - M_m \left(P_{S,m}\, pr_m + 1\right) + P_{S,m}}{M_m^{2}\, pr_m - M_m \left(P_{S,m} + pr_m\right) + P_{S,m}} \qquad (5)$$
Plugging in the values used for this simulation, P * S,O is almost 1.25; at this value, f eq is constant for all values of P m,O (Fig. 3, horizontal gray line). For P S,O less than this value, f eq is an increasing function of P m,O (Fig. 3, green curve); otherwise, it is decreasing (Fig. 3). f eq is an increasing function of P S,O for all values of P m,O (Figs. 2, 4).
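The threshold can be checked numerically by evaluating Eq. (5) with the simulation values. The sketch below assumes the grouping of terms in Eq. (5) as a single fraction and takes M m = 0.375 from pr m × M m = 15% with pr m = 40%; the function name is hypothetical.

```python
def critical_hbs_protection(pr_m=0.40, M_m=0.375, M_O=0.25, P_Sm=10.0):
    """Value of P_S,O at which f_eq no longer depends on P_m,O (Eq. 5)."""
    numerator = (M_O * M_m * (P_Sm - 1) * (pr_m - 1)
                 + M_m ** 2 * pr_m
                 - M_m * (P_Sm * pr_m + 1)
                 + P_Sm)
    denominator = M_m ** 2 * pr_m - M_m * (P_Sm + pr_m) + P_Sm
    return numerator / denominator
```

With the default arguments, the function returns about 1.247, consistent with the value of almost 1.25 quoted above.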

Discussion
The results of scenario 1, where there was no mortality from other diseases, were expectedly similar to scenario 6 of the previous study 11. In scenario 2, the cumulative probability of death from other diseases before the reproductive age was 25%, but neither malaria nor the HbS gene conferred protection against other diseases. Therefore, as expected, the f eq was the same as that in scenario 1; the only determinant of f eq was the protection provided by AS and SS genotypes against malaria. Presence of a disease with high mortality increased the rate of change of f gene over generations (Fig. 1, all but the green curve).
Certain disease conditions may provide protection against other diseases or decrease their mortality and morbidity rates. For instance, it has been shown that α+-thalassemia confers protection against malaria as well as other infectious diseases 30. A cohort study has shown that the presence of the AS genotype is associated with decreased all-cause mortality among young children 14. A recent systematic review revealed that sickle cell disease confers protection against human immunodeficiency virus (HIV) infection 16. Sickle cell disease can also provide protection against the intra-erythrocytic parasite, Babesia 17. Presence of HbS has also been shown to be associated with lower mortality and morbidity rates from severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection 18. The non-specific protection is not limited to hemoglobinopathies. An ecological study showed an association between the prevalence of malaria and the cumulative incidence of SARS-CoV-2 infection, an observation that is biologically plausible 15. The plasmodium double-stranded DNA and hemozoin can trigger Toll-like receptor 7, with resultant activation of intracellular downstream signaling cascade reactions leading to production of type I interferons and pro-inflammatory cytokines, resulting in short-term non-specific protection against other infectious diseases 31,32. Therefore, it seems that both malaria and the HbS gene may confer non-specific protection against other diseases.
Protection conferred by the HbS gene (P S,O > 1) should expectedly increase f eq 11; this is in keeping with the results of the current study too (Figs. 2, 4). This is why the f eq in scenario 4 (P S,O = 3, P m,O = 1 [no protection]) is 25.7% (Fig. 1, magenta curve), which can explain the high f gene reported from certain African tribes 3,4, QED (quod erat demonstrandum).
The protection provided by malaria is less straightforward. If malaria also conferred protection, as assumed in scenario 5 (P S,O = 3, P m,O = 1.5), the f eq decreased by 1% compared to scenario 4. At first glance, this might seem reasonable; the protection conferred by malaria against other diseases should diminish the advantage of the HbS gene against malaria, so the f eq should decrease, as was observed in scenario 5 compared to scenario 4. But things are not as simple as they seem; f eq depends on the amounts of protection conferred by both malaria and the HbS gene. If the protection provided by the HbS gene (P S,O) is less than a certain value, P * S,O (Eq. 5), f eq is an increasing function of P m,O (Figs. 3, 4); malaria protects more people against other diseases than it kills. As a consequence, to protect people against malaria, f gene increases and a new equilibrium state will be attained. In scenario 3, the protection conferred by malaria was 1.5-fold whereas the HbS gene did not protect at

Figure 1. Variation of the gene (allele) frequency (f gene) over generations. The value at each point on each curve is the mean value of the f gene at each generation obtained from 10,000 repeats of the corresponding scenario. Green curve is the f gene under scenario 1 (Table 1, scenario 6 of the previous article 11), where the mortality of other diseases was considered zero; yellow curve, scenario 2, where although other diseases killed 25% of people before the reproductive age, neither malaria nor the hemoglobin S (HbS) gene conferred protection against other diseases; red curve, scenario 3 (only malaria protection); magenta curve, scenario 4 (only HbS gene protection); and blue curve, scenario 5 (malaria and HbS gene protections). The shaded regions are the 95% confidence intervals for the f gene under the scenarios studied. The horizontal dashed gray lines represent the equilibrium f gene computed from Eq. 4 (Table 1). The f gene values in scenarios 1 and 2 are similar to that reported for scenario 6 of the previous article 11.

Figure 2. Equilibrium gene (allele) frequency (f eq ) for different values of protections conferred by malaria (P m,O ) and the hemoglobin S (HbS) gene (P S,O ) against other diseases.f eq is a minimum when neither malaria nor the HbS gene conferred protection (P m,O = P S,O = 1); it is a maximum when P m,O = 1 and P S,O = 4.

Figure 3. The equilibrium gene (allele) frequency (f eq) against the amount of protection conferred by malaria against other diseases (P m,O) for different levels of protection provided by the hemoglobin S (HbS) gene. Figures near the curves are the amounts of protection provided by the HbS gene against other diseases (P S,O). There is a certain value for P S,O (Eq. 5) where the f eq is constant regardless of P m,O (horizontal gray line).

Figure 4. Equilibrium gene (allele) frequency (f eq ) for different combinations of protections conferred by malaria and the hemoglobin S (HbS) gene against other diseases.The minimum f eq corresponds to a point where neither malaria nor the HbS gene conferred protection, the white tile (P m,O = P S,O = 1); the maximum, where the protection conferred by malaria is nil (P m,O = 1), but the HbS gene provided the highest protection (P S,O = 4).The horizontal dashed line corresponds to the value derived from Eq. (5), 1.25; with P S,O < 1.25 (the first row at the bottom), f eq is an increasing function of P m,O ; for P S,O > 1.25 (other rows), it is decreasing.
The protections conferred by AS and SS genotypes against other diseases were considered equal. M O, mortality from other diseases; P m,O, protection conferred by malaria against other diseases; P S,O, protection conferred by AS or SS genotypes against other diseases; f eq, equilibrium gene (allele) frequency. *Computed from Eqs. (2)-(4). †Equivalent to scenario 6 in the previous article 11.

Table 2. Pseudo-code of the simulation program.